Latent Gaussian Models



Hamiltonian Monte Carlo using an adjoint-differentiated Laplace approximation: Bayesian inference for latent Gaussian models and beyond

Neural Information Processing Systems

Gaussian latent variable models are a key class of Bayesian hierarchical models with applications in many fields. Performing Bayesian inference on such models can be challenging, as Markov chain Monte Carlo algorithms struggle with the geometry of the resulting posterior distribution and can be prohibitively slow. An alternative is to use a Laplace approximation to marginalize out the latent Gaussian variables and then integrate out the remaining hyperparameters using dynamic Hamiltonian Monte Carlo, a gradient-based Markov chain Monte Carlo sampler. To implement this scheme efficiently, we derive a novel adjoint method that propagates the minimal information needed to construct the gradient of the approximate marginal likelihood. This strategy yields a scalable differentiation method that is orders of magnitude faster than state-of-the-art differentiation techniques when the hyperparameters are high dimensional. We prototype the method in the probabilistic programming framework Stan and test the utility of the embedded Laplace approximation on several models, including one where the dimension of the hyperparameter is 6,000. Depending on the case, the benefits can include an alleviation of the geometric pathologies that frustrate Hamiltonian Monte Carlo and a dramatic speed-up.
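The embedded Laplace approximation the abstract describes can be illustrated on a toy latent Gaussian model. The sketch below is a simplification, not the paper's Stan implementation: the Poisson likelihood and the fixed kernel matrix are assumed for illustration. It finds the posterior mode of the latents by Newton iteration and evaluates the standard Laplace estimate of the log marginal likelihood.

```python
import numpy as np

def laplace_log_marginal(y, K, n_iter=50):
    """Laplace approximation to log p(y) for a toy latent Gaussian model:
    theta ~ N(0, K), y_i | theta_i ~ Poisson(exp(theta_i)).
    Returns the approximate log marginal likelihood (up to the constant log(y!))."""
    n = len(y)
    theta = np.zeros(n)
    for _ in range(n_iter):
        mu = np.exp(theta)
        grad = y - mu                     # d/dtheta of log p(y | theta)
        W = np.diag(mu)                   # negative Hessian of log p(y | theta)
        B = np.eye(n) + K @ W
        # Newton step on the unnormalized log posterior, written in its
        # standard "kernelized" form: theta <- (I + K W)^{-1} K (grad + W theta)
        theta = np.linalg.solve(B, K @ (grad + W @ theta))
    mu = np.exp(theta)
    W = np.diag(mu)
    log_lik = np.sum(y * theta - mu)      # Poisson log-likelihood, dropping log(y!)
    _, logdet = np.linalg.slogdet(np.eye(n) + K @ W)
    quad = theta @ np.linalg.solve(K, theta)
    return log_lik - 0.5 * quad - 0.5 * logdet
```

For a one-dimensional latent the estimate can be checked directly against numerical integration of the marginal.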





Gaussian Invariant Markov Chain Monte Carlo

Titsias, Michalis K., Alexopoulos, Angelos, Liu, Siran, Dellaportas, Petros

arXiv.org Machine Learning

We develop sampling methods consisting of Gaussian invariant versions of random walk Metropolis (RWM), the Metropolis adjusted Langevin algorithm (MALA), and second-order Hessian or Manifold MALA. Unlike standard RWM and MALA, we show that Gaussian invariant sampling can lead to ergodic estimators with improved statistical efficiency. This is due to a remarkable property of Gaussian invariance that allows us to obtain exact analytical solutions to the Poisson equation for Gaussian targets. These solutions can be used to construct efficient and easy-to-use control variates for variance reduction of estimators under any intractable target. We demonstrate the new samplers and estimators in several examples, including high-dimensional targets in latent Gaussian models, where we compare against several advanced methods and obtain state-of-the-art results. We also provide theoretical results regarding geometric ergodicity, and an optimal scaling analysis that shows the dependence of the optimal acceptance rate on the Gaussianity of the target.
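As a point of reference for the samplers named above, here is a minimal sketch of standard (non-Gaussian-invariant) MALA; the Gaussian invariant variants and the Poisson-equation control variates from the paper are not reproduced here.

```python
import numpy as np

def mala(log_pdf, grad_log_pdf, x0, step, n_samples, rng):
    """Standard Metropolis adjusted Langevin algorithm.
    Proposal: x' = x + (step**2 / 2) * grad_log_pdf(x) + step * N(0, I),
    accepted with the usual Metropolis-Hastings correction."""
    x = np.asarray(x0, dtype=float)
    samples = np.empty((n_samples, x.size))

    def log_q(xp, x):
        # Log proposal density q(xp | x), up to a constant that cancels in the ratio.
        m = x + 0.5 * step**2 * grad_log_pdf(x)
        return -np.sum((xp - m)**2) / (2 * step**2)

    for i in range(n_samples):
        prop = x + 0.5 * step**2 * grad_log_pdf(x) + step * rng.standard_normal(x.size)
        log_alpha = (log_pdf(prop) + log_q(x, prop)) - (log_pdf(x) + log_q(prop, x))
        if np.log(rng.uniform()) < log_alpha:
            x = prop
        samples[i] = x
    return samples
```

On a standard normal target the chain should reproduce mean 0 and unit variance after burn-in.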


Review for NeurIPS paper: Hamiltonian Monte Carlo using an adjoint-differentiated Laplace approximation: Bayesian inference for latent Gaussian models and beyond

Neural Information Processing Systems

This paper develops an inference method for Gaussian latent variable models that employs a Laplace approximation to marginalize over the latent variables and infer hyperparameters using HMC. The authors use an adjoint method to efficiently compute the gradient with respect to the hyperparameters. The main contribution is that inference can scale to hyperparameters of high dimensionality (~1,000). This paper was overall well-received by reviewers, who remained on balance in favor of acceptance after the author response. The main outstanding points of criticism, which the AC would like to encourage the authors to address, are that: (1) the authors should more clearly motivate the use case for latent Gaussian models with a large number of parameters; (2) a discussion of recent advances in variational inference for GPs is warranted (and some form of comparison would be appreciated).


Review for NeurIPS paper: Hamiltonian Monte Carlo using an adjoint-differentiated Laplace approximation: Bayesian inference for latent Gaussian models and beyond

Neural Information Processing Systems

Weaknesses: My main questions regarding the paper: 1) When computing the Laplace approximation, this still requires calculation of the Hessian, which I believe is with respect to the latent variables (theta). This is referred to as W in Algorithm 1. Would it be possible to comment further on the trade-off between implementing full HMC versus the overhead of calculating the Hessian? I think this is the issue you are referring to in the second paragraph of the discussion section, where you mention higher-order automatic differentiation. I assume you stick to analytical Hessians. For example, "Semi-Separable Hamiltonian Monte Carlo for Inference in Bayesian Hierarchical Models" by Zhang and Sutton jointly samples over hyperparameters and parameters to overcome funnel-like behaviours similar to those of the Gaussian latent variable models that you explore.
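On the reviewer's point about the Hessian W: for likelihoods that factorize over observations, W = -∇²_θ log p(y | θ) is diagonal and available analytically, which keeps its overhead modest. A sketch for two common cases (assumed here for illustration; this does not reproduce the paper's Algorithm 1):

```python
import numpy as np

def W_poisson(theta):
    """Diagonal of W = -d2/dtheta2 log p(y | theta) for y_i ~ Poisson(exp(theta_i)).
    The per-observation log-likelihood term is y_i*theta_i - exp(theta_i),
    so the negative curvature is exp(theta_i), independent of y."""
    return np.exp(theta)

def W_bernoulli_logit(theta):
    """Diagonal of W for y_i ~ Bernoulli(sigmoid(theta_i)): p_i * (1 - p_i)."""
    p = 1.0 / (1.0 + np.exp(-theta))
    return p * (1.0 - p)
```

The analytical diagonals can be verified against a central finite difference of the log-likelihood.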


Approximate Probabilistic Inference for Time-Series Data: A Robust Latent Gaussian Model With Temporal Awareness

Johansson, Anton, Ramaswamy, Arunselvan

arXiv.org Artificial Intelligence

The development of robust generative models for highly varied non-stationary time series data is a complex yet important problem. Traditional models for time series prediction, such as Long Short-Term Memory (LSTM), are inefficient and generalize poorly as they cannot capture complex temporal relationships. In this paper, we present a probabilistic generative model that can be trained to capture temporal information and that is robust to data errors. We call it the Time Deep Latent Gaussian Model (tDLGM). Its novel architecture is inspired by the Deep Latent Gaussian Model (DLGM). Our model is trained to minimize a loss function based on the negative log loss. One contributing factor to tDLGM's robustness is our regularizer, which accounts for data trends. Experiments show that tDLGM is able to reconstruct and generate complex time series data, and that it is robust to noise and faulty data.
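Two building blocks the abstract alludes to, sketched generically (this is standard DLGM-style machinery, not the tDLGM architecture itself): the reparameterized Gaussian latent sample and a Gaussian negative log-likelihood of the kind used as a "negative log loss".

```python
import numpy as np

def reparam_sample(mu, log_var, rng):
    """Reparameterization trick used in DLGM-style models:
    z = mu + sigma * eps with eps ~ N(0, I), so the sample stays a
    differentiable function of the parameters (mu, log_var)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def gaussian_nll(x, mu, log_var):
    """Per-dimension Gaussian negative log-likelihood of x under N(mu, exp(log_var))."""
    return 0.5 * (np.log(2 * np.pi) + log_var + (x - mu)**2 / np.exp(log_var))
```

At x = mu with unit variance the per-dimension loss reduces to 0.5 * log(2*pi), a quick sanity check.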

